25 research outputs found

    Designing Semantic Kernels as Implicit Superconcept Expansions

    Recently, there has been increased interest in exploiting background knowledge in text mining tasks, especially text classification. At the same time, kernel-based learning algorithms such as Support Vector Machines have become a dominant paradigm in the text mining community. Among other reasons, this is due to their ability to achieve more accurate learning results by replacing the standard linear (bag-of-words) kernel with customized kernel functions that incorporate additional a priori knowledge. In this paper we propose a new approach to the design of ‘semantic smoothing kernels’ by means of an implicit superconcept expansion using well-known measures of term similarity. The experimental evaluation on two different datasets indicates that our approach consistently improves performance in situations where (i) training data is scarce or (ii) the bag-of-words representation is too sparse to build stable models when using the linear kernel.
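The abstract above does not give the kernel's exact form, so the following is only a minimal sketch of the general idea of a semantic smoothing kernel, k(x, y) = x^T Q y, where Q is a term-similarity matrix. The vocabulary, the values in `SIM`, and the function names are hypothetical and are not taken from the paper. For the result to be a valid kernel, Q should be positive semi-definite, which the toy matrix here is.

```python
def linear_kernel(x, y):
    """Plain bag-of-words kernel: dot product of two term-count vectors."""
    return sum(a * b for a, b in zip(x, y))


def smoothed_kernel(x, y, sim):
    """Semantic smoothing kernel k(x, y) = x^T Q y, with Q given by `sim`."""
    n = len(x)
    return sum(x[i] * sim[i][j] * y[j] for i in range(n) for j in range(n))


# Hypothetical vocabulary: ["car", "automobile", "banana"].
# SIM[i][j] is an assumed similarity between term i and term j.
SIM = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

doc_a = [1, 0, 0]  # contains only "car"
doc_b = [0, 1, 0]  # contains only "automobile"

# The linear kernel sees no overlap between the two documents,
# while the smoothed kernel credits the related terms.
print(linear_kernel(doc_a, doc_b))
print(smoothed_kernel(doc_a, doc_b, SIM))
```

In this toy case the two documents share no terms, so the linear kernel is zero, whereas the smoothed kernel returns the assumed similarity between "car" and "automobile". This is the kind of sparsity effect the abstract refers to.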

    Identification and analysis of conserved pockets on protein surfaces

    BACKGROUND: The interaction between proteins and ligands occurs at pockets that are often lined by conserved amino acids. These pockets can represent the targets for low molecular weight drugs. To make the search for new medicines as productive as possible, it is necessary to exploit "in silico" techniques and high-throughput and fragment-based screenings, which require the identification of druggable pockets on the surface of proteins; these pockets may or may not correspond to active sites. RESULTS: We developed a tool to evaluate the conservation of each pocket detected on the protein surface by CastP. This tool was named DrosteP because it recursively searches for optimal input sequences to be used to calculate conservation. DrosteP uses a descriptor of statistical significance, the Poisson p-value, as a target to optimize the choice of input sequences. To benchmark DrosteP we used monomeric or homodimeric human proteins with known 3D structure whose active site had been annotated in UniProt. DrosteP detects the active site with high accuracy: in 81% of the cases it coincides with the most conserved pocket. Comparing DrosteP with analogous programs is difficult because the outputs are different. Nonetheless, we could assess the efficacy of the recursive algorithm in identifying active site pockets by calculating conservation with the same input sequences used by other programs. We analyzed the amino-acid composition of conserved pockets identified by DrosteP and found that it differs significantly from the amino-acid composition of non-conserved pockets. CONCLUSIONS: Several methods that combine 3D structure and evolutionary sequence conservation have been proposed for predicting ligand binding sites on protein surfaces. Any method relying on conservation depends mainly on the choice of the input sequences. DrosteP chooses how deeply distant homologs must be collected to evaluate conservation and thus optimizes the identification of active site pockets. Moreover, it recognizes conserved pockets other than those coinciding with the sites annotated in UniProt, which might represent useful druggable sites. The distinctive amino-acid composition of conserved pockets provides useful hints on the fundamental principles underlying protein-ligand interaction. AVAILABILITY: http://www.icb.cnr.it/project/drosteppy

    Prediction of the responsiveness to pharmacological chaperones: lysosomal human alpha-galactosidase, a case of study

    BACKGROUND: Pharmacological chaperone therapy is a promising approach to curing genetic diseases. It relies on substrate competitors used at sub-inhibitory concentrations, which can be administered orally, reach difficult tissues, and are inexpensive. Clinical trials are currently under way for Fabry disease, a lysosomal storage disorder caused by inherited genetic mutations of alpha-galactosidase. Regrettably, not all genotypes respond to these drugs. RESULTS: We collected the experimental data available in the literature on the enzymatic activity of ninety-six missense mutants of lysosomal alpha-galactosidase measured in the presence of pharmacological chaperones. We associated with each mutation seven features derived from the analysis of the 3D structure of the enzyme, two features associated with its thermodynamic stability, and four features derived from sequence alone. Structural and thermodynamic analysis explains why some mutants of human lysosomal alpha-galactosidase cannot be rescued by pharmacological chaperones: approximately forty per cent of the non-responsive cases examined can be correctly associated with a negative prognostic feature. They include mutations occurring in the active site pocket, mutations preventing disulphide bridge formation, and severely destabilising mutations. Despite this finding, mutations responsive to pharmacological chaperones cannot be predicted with high accuracy from combinations of structure- and thermodynamics-derived features, even with the aid of classical and state-of-the-art statistical learning methods. We developed a procedure to predict responsive mutations with an accuracy as high as 87%: the method scores the mutations by using a suitable position-specific substitution matrix. Our approach is of general applicability since it does not require knowledge of the 3D structure but relies only on the sequence. CONCLUSIONS: Responsiveness to pharmacological chaperones depends on the structural and functional features of the disease-associated protein, whose complex interplay is best reflected in sequence conservation under evolutionary pressure. We propose a predictive method that can be applied to screen novel mutations of alpha-galactosidase. The same approach can be extended on a genomic scale to find candidates for therapy with pharmacological chaperones among proteins with unknown tertiary structures.
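The abstract says only that mutations are scored with a position-specific substitution matrix. A minimal sketch of that kind of lookup might look as follows. The matrix values, the positions, the threshold, and the decision rule (higher score means predicted responsive) are all hypothetical illustrations, not the authors' data or procedure.

```python
import re

# Hypothetical position-specific scores: position -> residue -> score.
# Real matrices are derived from alignments of homologous sequences.
TOY_PSSM = {
    10: {"R": 5, "K": 2, "W": -4},
    20: {"A": 3, "S": 1, "P": -3},
}


def parse_mutation(label):
    """Split a label such as 'R10K' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a missense mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def score_mutation(label, pssm):
    """Look up the score of the mutant residue at the mutated position."""
    _, position, mutant = parse_mutation(label)
    return pssm[position][mutant]


def predict_responsive(label, pssm, threshold=0):
    """Assumed rule: a well-tolerated substitution is predicted responsive."""
    return score_mutation(label, pssm) >= threshold


for label in ("R10K", "R10W", "A20S"):
    print(label, score_mutation(label, TOY_PSSM), predict_responsive(label, TOY_PSSM))
```

The point of the sketch is that the prediction needs only the sequence position and the substituted residue, which matches the abstract's claim that no 3D structure is required.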

    A novel integrated industrial approach with cobots in the age of industry 4.0 through conversational interaction and computer vision

    From robots that replace workers to robots that serve as helpful colleagues, the field of robotic automation is experiencing a new trend that represents a huge challenge for component manufacturers. The contribution starts from an innovative vision of ever closer collaboration among the cobot, which can do a specific physical job with precision; AI, which can analyze information and support the decision-making process; and the human, who can hold a strategic vision of the future.

    Radionuclide measurements as tool for geophysical studies on Mt. Etna Volcano (Sicily)


    Genetic and epigenetic mutations affect the DNA binding capability of human ZFP57 in transient neonatal diabetes type 1

    In the mouse, ZFP57 contains three classical Cys2His2 zinc finger domains (ZFs) and recognizes the methylated TGCmetCGC target sequence using the first and the second ZFs. In this study, we demonstrate that human ZFP57 (hZFP57), which contains six Cys2His2 ZFs, binds the same methylated sequence through the third and the fourth ZFs, and we identify the amino acids critical for DNA interaction. In addition, we present evidence indicating that hZFP57 mutations and hypomethylation of the TNDM1 ICR, both associated with Transient Neonatal Diabetes Mellitus type 1, result in loss of hZFP57 binding to the TNDM1 locus, likely causing PLAGL1 activation.

    Ontology-driven Information Retrieval in FF-Poirot

    This paper proposes a new approach for supporting domain information retrieval and information extraction on the web, using an original query expansion technique supported by an ad-hoc ontology focused on a specific domain of interest. The system has been built and tested in the framework of the FF-Poirot project, for supporting fine-grained retrieval from the Internet aimed at detecting fraudulent financial sites. In a first stage, using a short list of keywords given by the user, the application mines the web and retrieves relevant documents. These documents are then clustered into coherent groups focusing on specific subjects. The ontology model represents the most important concepts of the domain of interest and links them to the user need as expressed by the keywords. Once clusters of documents are available after the first stage, the ontology can be used to extract from these clusters the most interesting documents (the most probable fraudulent sites in the framework of the FF-Poirot application). By browsing the ontology and selecting specific concepts, the user starts a query expansion engine that refines the search, creating a new query based on terminological evidence tied in the ontology to the selected concepts. The paper describes the overall software architecture of the application as used in the project, focusing specifically on the query expansion engine and the supporting ontological model adopted. Experimental evidence from FF-Poirot will be used to demonstrate the feasibility and the advantages of the adopted technique.
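The query expansion step described above can be pictured with a small sketch: the user's keywords are extended with terms that the ontology ties to the selected concepts. The ontology contents, concept names, and the `expand_query` function are hypothetical and stand in for the FF-Poirot ontology and engine, whose details the abstract does not give.

```python
# Hypothetical ontology fragment: concept -> terms attached to that concept.
ONTOLOGY = {
    "investment_scam": ["guaranteed returns", "risk-free"],
    "unlicensed_broker": ["unregistered", "offshore"],
}


def expand_query(keywords, selected_concepts, ontology):
    """Append the terms of each selected concept, skipping duplicates."""
    expanded = list(keywords)
    seen = set(expanded)
    for concept in selected_concepts:
        for term in ontology.get(concept, []):
            if term not in seen:
                expanded.append(term)
                seen.add(term)
    return expanded


query = expand_query(["forex", "broker"], ["unlicensed_broker"], ONTOLOGY)
print(query)
```

The original keywords are kept first and in order, so the expanded query still reflects the user's stated need, with the ontology terms acting as refinements.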

    A similarity measure for unsupervised semantic disambiguation

    This paper presents an unsupervised method for resolving the lexical ambiguity of nouns. The method relies on the topological structure of the WordNet noun taxonomy, on which a notion of semantic distance is defined. An unsupervised semantic tagger based on this measure is evaluated on a hand-annotated portion of the British National Corpus and compared with a supervised approach based on the Maximum Entropy Model.
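The abstract does not define its distance measure, so the sketch below uses a common stand-in: the number of edges between two nodes via their lowest common ancestor in a hypernym tree. The tiny taxonomy, the sense labels, and the rule of picking the sense closest to the context nouns are assumptions for illustration, not WordNet data or the paper's measure.

```python
# Hypothetical hypernym tree: node -> parent ("entity" is the root).
PARENT = {
    "bass_fish": "fish",
    "trout": "fish",
    "fish": "animal",
    "animal": "entity",
    "bass_instrument": "instrument",
    "guitar": "instrument",
    "instrument": "artifact",
    "artifact": "entity",
}


def ancestors(node):
    """Return the path from a node up to the root, node included."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def distance(a, b):
    """Edge count between a and b through their lowest common ancestor."""
    depth_from_a = {n: i for i, n in enumerate(ancestors(a))}
    for steps_from_b, n in enumerate(ancestors(b)):
        if n in depth_from_a:
            return depth_from_a[n] + steps_from_b
    raise ValueError(f"{a!r} and {b!r} share no ancestor")


def disambiguate(senses, context):
    """Pick the sense with the smallest total distance to the context nouns."""
    return min(senses, key=lambda s: sum(distance(s, c) for c in context))


print(distance("bass_fish", "trout"))
print(disambiguate(["bass_fish", "bass_instrument"], ["trout"]))
```

Because the method needs only the taxonomy and the surrounding nouns, no sense-annotated training data is involved, which is what makes this style of tagging unsupervised.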
